
Add vLLM Qwen3.5-27B-FP8 support + reasoning field fix#78

Merged
galic1987 merged 2 commits into main from agent-20260315-123322
Mar 15, 2026

Conversation

@galic1987
Collaborator

Summary

  • Support vLLM "reasoning" field alongside existing "reasoning_content" (SGLang/llama.cpp) via serde alias — backward compatible
  • Handle "content": null in API responses when thinking mode consumes all tokens
  • Add dual RTX 4090 vLLM serving config to README (262K context, fp8 KV, hybrid Mamba/Attention)
  • SAB scorecard: 95/100 weighted (17 BLOOM, 2 GROW, 1 FROST) on Qwen3.5-27B-FP8

Test plan

  • cargo test --lib -- api::types — 32/32 pass
  • cargo test --lib -- api:: — 171/171 pass
  • Full SAB 20-scenario benchmark: 95/100 weighted BLOOM
  • Verified backward compat with SGLang reasoning_content field
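The dual RTX 4090 serving config mentioned in the summary might look like the sketch below. The flags (`--tensor-parallel-size`, `--max-model-len`, `--kv-cache-dtype`, `--gpu-memory-utilization`) are standard vLLM serve options; the model ID and the utilization value are assumptions, while the 262144 context length and fp8 KV cache come from this PR's summary:

```shell
# Sketch: serve Qwen3.5-27B-FP8 across two RTX 4090s.
# Model path and --gpu-memory-utilization value are assumptions.
vllm serve Qwen/Qwen3.5-27B-FP8 \
  --tensor-parallel-size 2 \
  --max-model-len 262144 \
  --kv-cache-dtype fp8 \
  --gpu-memory-utilization 0.90
```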

🤖 Generated with Claude Code

galic1987 and others added 2 commits March 15, 2026 12:33
- Support both "reasoning" (vLLM) and "reasoning_content" (SGLang/llama.cpp)
  response fields via serde alias
- Handle null content in API responses when model uses all tokens for thinking
- Add dual RTX 4090 vLLM serve config to README (262K context, fp8 KV cache)
- Add SAB benchmark config and scorecard: 95/100 weighted (17 BLOOM, 2 GROW)
- Update selfware.toml for local Qwen3.5-27B-FP8 endpoint

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@galic1987 galic1987 merged commit 1be0cf5 into main Mar 15, 2026
6 of 13 checks passed